Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fluent integration #211

Closed
wants to merge 29 commits into from
Closed

Fluent integration #211

wants to merge 29 commits into from

Conversation

kazimuth
Copy link
Contributor

Starting the work discussed in #202, not functional in any way yet.

My current implementation will end up baking the compiled templates into result executables to keep deployment simple, although I think I'll also add in the ability to reload translation files at runtime somehow.

@kazimuth
Copy link
Contributor Author

kazimuth commented Feb 17, 2019

Here's a rendered draft of what I want the final API to look like (from the docs I'm writing). This API is a little magical but I think it should end up nice to use. Thoughts?

Internationalization with Fluent

Enabling the with-i18n feature enables askama's internationalization / localization (i18n / l10n)
functionality, which you can use to provide translations of your templates into other languages.

Askama's i18n relies on Fluent to parse and apply translations.

Using i18n tries to be as simple as possible, but it's still a little involved.

Create a folder structure at your project root, organized like so:

  • i18n
    • en-US
      • greeting.ftl
    • es-MX
      • greeting.ftl

Add some localizations in Fluent's FTL Syntax:

i18n/en-US/greeting.ftl:

hello = Hello, $name!
age.tracker = You are $age_hours hours old.

i18n/es-MX/greeting.ftl:

hello = ¡Hola, $name!
age.tracker = Tiene $age_hours horas.

Call the init_askama_i18n!() macro at your crate root:

extern crate askama;

init_askama_i18n!();

This will create a module askama_i18n which contains resources used by
the localization system. See the documentation for the init_askama_i18n! macro
for customization options. Note that, by default, translations you provide will be
compiled into the output executable, to ease deployment; all you need is one binary.

Now, on templates you want to localize, add a member locale:

#[derive(Template)]
#[template(path = "hello.html")]
struct HelloTemplate<'a> {
    locale: &'a str,
    name: &'a str,
    age_hours: f32,
}

And now you can use the localize filter in your templates:

<h1>{ localize(hello, name: name) }</h1>
<h3>{ localize(age.tracker, age_hours: age_hours) }</h1>

(TODO: should we automatically pass eligible template variables
into the localization system?)

Now, your template will be automatically translated when rendered:

println!("{}", HelloTemplate { locale: "en-US", name: "Jamie", age_hours: 195_481.8 });

<h1>Hello, Jamie!</h1>
<h3>You are 195,481.8 hours old.</h1>

println!("{}", HelloTemplate { locale: "es-MX", name: "Jamie", age_hours: 195_481.8 });

<h1>¡Hola, Jamie!</h1>
<h3>Tienes 195.481,8 horas.</h1>

It's up to you to pass the correct locale in for each user.

@djc
Copy link
Collaborator

djc commented Feb 17, 2019

Thanks for working on this! Notes so far:

  • The global macro call doesn't fit that well with the design so far. Have you thought about ways to get by without it?

  • Folder structure looks good.

  • Not sure about the design of having locale being a member of the context struct. This mixes the application's namespace with Askama's, which is something I've so far tried to prevent.

  • The localize filter uses named arguments, but those are so far not supported in Askama (largely because there's no native support in Rust). Not sure how necessary that is, or if you've thought about what code those should generate?

  • You seem to have a lot of _ prefixed items, which as far as I know is uncommon in Rust. What are you trying to do with those?

@kazimuth
Copy link
Contributor Author

kazimuth commented Feb 17, 2019

Glad to, it's a fun little project :)

  • The global macro call is sort of ugly, but I believe it's necessary if we want to embed the localizations directly in the target executable. Afaict a global macro call is the only way to efficiently embed data (the .ftl files) in an executable. We could also inline the files for every template, but that would bloat things; or we could delegate loading localization files entirely to the user, and let them find the files at runtime. I like the global macro call because it keeps deployment simple, but it is sort of magic. I prefer easy-to-use magic to hard-to-use non-magic, but I know that's a matter of taste.

    • Also note that you could have multiple init_askama_i18n!() calls in different modules and use separate localizations in separate templates; I could make the path to the embedded localizations an optional argument to localize. I don't think that'll be a common use case though.
  • I'm happy to put locale somewhere else but I'm not sure where it should go exactly. As an argument to render?

  • the localize named arguments are bikesheddable. If present they'd be syntactic sugar for creating a HashMap mapping argument names to values, which is what fluent expects. We could also use tuples or some other form of sugar; or just pass everything in a template directly to fluent. The problem is that fluent doesn't expect ordered arguments, just key-value mappings, so there needs to be some way to represent that.

  • the _ prefix items are an API designed to be consumed by the init_askama_i18n! macro but not used by users directly -- the codegen that uses them doesn't exist yet. I could also just inline them to the code generated by the macro.

@djc
Copy link
Collaborator

djc commented Feb 17, 2019

The question about how template implementations will find where the globals are defined is a real issue, though. Have you thought about how that might work?

I don't think we want an extra argument to render() since that would require every user to care about localization (and thus seems like fairly bad API breakage for every current user). It could potentially be an Option-wrapped argument to render_into(), in which case it would probably not break most users but also prevents localization users from using the easier render() API. Maybe the locale field isn't so bad after all, it's just doesn't fit neatly into the design so far.

If Fluent requires a HashMap it probably makes sense to just do named arguments. It might be nice if there was a way that didn't require the allocations inherent in that data structure, though. (But that's arguably an optimization that we can postpone.) Just be aware that there'll be some work in fixing the code generator to call the filter correctly. Also why not just call localize() i18n instead?

Let's stay away from _-prefixing for now, please, and review at the end whether it makes sense to introduce it again for some things.

@kazimuth
Copy link
Contributor Author

I was just thinking the localize template would always call ::askama_i18n_generated::localize(), sorta like how filters look for local filters modules, but starting at the root of the current crate. Could allow calling other localize filters too, of course. Or maybe the right way to do it is to just have an askama_i18n_generated::filters module and let the user use askama_i18n_generated::filters; whenever they want to access localize?

Maybe the locale field isn't so bad after all, it's just doesn't fit neatly into the design so far.

There's definitely some lack of separation of concerns in here, 🤷‍♀️ what can you do.

If Fluent requires a HashMap it probably makes sense to just do named arguments. It might be nice if there was a way that didn't require the allocations inherent in that data structure, though.

Yeah that's a problem inherent in the fluent API, it's very allocation-y. Also doesn't take a Writer but returns Strings :/
Could be a thing worth doing a PR for. According to the fluent benchmarks FluentBundle.format currently takes about 500 ns, which isn't terrible, but could probably be sped up.

@kazimuth
Copy link
Contributor Author

kazimuth commented Feb 18, 2019

Ok a few more thoughts on how to find the translations:

What if localize() automatically looked in an i18n module that had to be in scope, like filters? I think that would fit with the design a little better.

We could also just make a macro you call in your filters module that creates a localize() filter, but since it seems like localize will need some magic syntax this seems more fragile.

@djc
Copy link
Collaborator

djc commented Feb 18, 2019

Yeah, I quite like the idea of letting the filter look into an i18n module that must be in scope. Let's do that.

Fluent returning Strings definitely sounds like it might be nice to fix upstream, in our performance work the intermediate allocations really adversely affected performance.

@djc
Copy link
Collaborator

djc commented Feb 19, 2019

Was wondering this morning if we could do a custom derive instead of the other macro thing? So:

#[derive(Localization)]
struct I18n;

And then all the generated stuff would be consts and associated types from a trait impl for that thing? That would at least make it clearer what it exposes (by looking at the trait definition). Could also expose a particular type that's used for the locale field, so users would pass I18n::locale("en-US").

@djc
Copy link
Collaborator

djc commented Feb 19, 2019

Or, really next-level, we could route localized template rendering through that interface (so I18n::render("en-US"). What do you think?

@Weasy666
Copy link
Contributor

Weasy666 commented Feb 19, 2019

What do you think about this?

fn main() {
    let hello = HelloTemplate { name: "world" }; // instantiate your struct
    println!("{}", hello.locale("en-US").render().unwrap()); // then render it in english
}

You can import the local() function with the with-i18n flag. If imported and not called, the template defaults to english and if not imported, nothing changes.
I don't know if that is possible, haven't look at the code, but i think this would be a nice way to use it.

@kazimuth
Copy link
Contributor Author

We could make it a trait and the codegen would be pretty much the same; there need to be some lazy_statics behind the scenes but we can put those in a hidden module. The user won't really ever need to call methods on the trait or construct the trait manually, though, so I'm not sure it's worth it. Not quite sure how I18n::render("en-US") would work - how does the template interact with the i18n there?

One new idea I had was having a #[locale] attribute in the template struct:

struct HelloTemplate<'a> {
   name: &'a str,
   #[locale]
   locale: &'a str
}

That makes it clear that that particular struct member is "special", and is actually pretty extensible if we ever wanna pass other stuff into templates, sorta halfway to a builder pattern.

template.locale("xy-ZW").render(...) is also workable, we could add a render_with_locale method to Template and add a TemplateWithLocale<T: Template> struct that passes the locale in... i kinda like #[locale] more tho

@kazimuth
Copy link
Contributor Author

random thought: @djc, how would you feel about a short syntax for localization? it's already something the parser knows about anyway.

like instead of {{ localize(message.attr, arg1: v1) }} you could have {{ @message.attr(arg1: v1) }} or something. Might be convenient if you're localizing a lot of messages. On the other hand, not necessarily obvious.

@kazimuth
Copy link
Contributor Author

I made it so that your build fails if your localizations have syntax errors. Could also make it just a warning, fluent degrades gracefully.

another fun thing: you can now run cargo test i18n_coverage -- --nocapture in an askama project with i18n to get output like the following:

askama-i18n-coverage: translated messages and attributes per locale:
askama-i18n-coverage: en-US      33% (10/30)
askama-i18n-coverage: es-MX     100% (30/30)
askama-i18n-coverage: help: to get accurate results, make sure that messages
 not used directly by your software are prefixed with a dash (`-`).
test i18n::tests::i18n_coverage ... ok

@djc
Copy link
Collaborator

djc commented Feb 21, 2019

So I think I want to require that the user gives the global object an explicit name in some way -- though I'm not yet sure what the best way to go about that is. And then I also want localized template rendering to explicitly pull in both the global and the locale name. Within those constraints I'm not yet sure what the best way to do it might be.

I do like the idea of dedicating some specific syntax to localized expressions, but I think we should punt on that for this PR and first get the basics working with the localize() function call.

@kellenff
Copy link
Contributor

kellenff commented Feb 21, 2019

I apologize if I'm late to the party, but would a typed approach work instead of a filter? I understand if people prefer to have the filter syntax settled; but if not, below is what I'm thinking:


Accessing fluent definitions from Rust

Fluent message is a struct with any possible variables as fields. An enum with the available locales would also be generated, implementing From<Box<str>> so that they could still be referred to by e.g. "en-US". A trivial example would be:

i18n/en-US/greeting.ftl:

hello = Hello, $name!
age.trcker = You are $age_hours hours old.

i18n/es-MX/greeting.ftl:

hello = ¡Hola, $name!
age.tracker = Tiene $age_hours horas.

Generated src/i18n.rs:

// Generated code should be edition 2015 compatible
use ::askama::Localize;

#[allow(non_camel_case_types)]
pub enum Locales {
    enUS,
    esMX,
}

impl From<Box<str>> for Locales {
    fn from(s: Box<str>) -> Locales {
        match s.as_ref() {
            "en-US" => Locales::enUS,
            "es-MX" => Locales::esMX,
            other => panic!("{} is not a locale in your i18n directory.", other),
        }
    }
}

pub struct Hello<'a> {
    pub name: &'a str,
}

impl<'a> Localize for Hello<'a> {
    type Locales = Locales;
    
    fn to_locale(&self, locale: Self::Locales) -> String {
        match locale {
            Locales::enUS => format!("Hello, {name}", name=self.name),
            Locales::esMX => format!("¡Hola, {name}", name=self.name),
        }
    }
}

pub struct AgeTracker<'a> {
    pub age_hours: &'a str,
}

impl<'a> Localize for AgeTracker<'a> {
    type Locales = Locales;
    
    fn to_locale(&self, locale: Self::Locales) -> String {
        match locale {
            Locales::enUS => format!("You are {age_hours} hours old", age_hours=self.age_hours),
            Locales::esMX => format!("Tiene {age_hours} horas", age_hours=self.age_hours)
        }
    }
}

You'd then use localizations like so in your template contexts:

templates/greeting.html

<h2>{ hello }</h2>
<h6>{ age_tracker }</h6>

src/lib.rs

use askama::Template;
use crate::i18n::{Locales, Hello, AgeTracker};

#[derive(Template)]
#[template(path="greeting.html")]
pub struct Greeting<'a> {
	hello: Hello<'a>
	age_tracker: AgeTracker<'a>
}

#[test]
fn localization_works() {
	let expected_english = "<h2>Hello, Alice</h2>
<h6>You are 157680 hours old</h6>";
	let expected_spanish = "<h2>¡Hola, Alice</h2>
<h6>Tiene 157680 horas</h6>";

	let greeting = Greeting {
		name: Alice,
		age_tracker: 157680,
	};
	
	// let english = greeting.with_locale("en-US").render().unwrap() will also work
	let english = greeting.with_locale(Locales::enUS).render().unwrap();
	let spanish = greeting.with_locale(Locales::esMX).render().unwrap():
	
	assert_eq!(&english, expected_english);
	assert_eq!(&spanish, expected_spanish);
}

This uses the builder-pattern mentioned by @kazimuth in this comment. The TemplateWithLocale struct renders fields which implement Localize by calling to_locale using the Locale enum provided via the builder method.

Where to put the generated i18n code

Using something like diesel print-schema to generate the i18n module might be an option to ensure that the localizations are visible. The location of the generated module file would be configurable via askama.toml like diesel.toml. Then the generated module can be used like any other Rust code.


I hope this doesn't wrench the conversation off-topic too much. I appreciate all of the work you're putting into this, @kazimuth!

@Deedasmi
Copy link

Deedasmi commented Feb 21, 2019

It looks like you're only able to translate text in the structs for that. The problem with that is that not all text is contained within the struct. Plenty of 'hard-coded' text in the html files also needs a way to call out for localization.

@kellenff
Copy link
Contributor

@Deedasmi I must be missing something. It seems that the point of Fluent is to have all translated text in these definition files. Each "phrase" would then be translated into a Rust data structure. If it's a string literal, then you could use an empty struct that implements Localize returning the defined string literals. The apps I've worked on that are localized use this approach with essentially a huge map of terms, which would translate to Rust types in this implementation. I don't see a meaningful advantage of localizations.get("name", arg1, arg2) vs localizations::name { arg1, arg2 }, at least.

@Deedasmi
Copy link

Deedasmi commented Feb 21, 2019

I could also very easily be missing something. But the rest of the conversation ties the locale to the template, which would allow things like <p>{{ localize(website_intro) }}</p>, and toss that into any old HTML. As long as the template doing the rendering supports translation, it works. In your version, website_intro has to be a struct member on every template struct that uses renders that translation. That's a lot dev overhead.

@kellenff
Copy link
Contributor

@Deedasmi Ah, that's a good point. The #[locale] attribute mentioned here could provide an easy way to get at general terms which don't require arguments. If the arguments have to be passed to the template anyway, then I think it's a little bit harder to tell.

@kellenff
Copy link
Contributor

I mostly proposed the idea because conceptually it just seems so cool, but the ergonomics probably aren't there, at least as far as I can figure out on my own 😜

@djc
Copy link
Collaborator

djc commented Mar 6, 2019

Sounds very cool! Sorry I haven't been as active here recently, it's very busy. Please rest assured that I'm still interested and will review this more closely as it gets closer to being mergeable.

@kazimuth
Copy link
Contributor Author

This is pretty much ready to merge.

Current cargo docs:


Internationalization with Fluent

Enabling the with-i18n feature enables askama's internationalization / localization (i18n / l10n)
functionality, which you can use to provide translations of your templates into other languages.

Askama's i18n relies on Fluent to parse and apply translations.

Using i18n tries to be as simple as possible, but it's still a little involved.

Create a folder structure at your project root, organized like so:

  • i18n
    • en_US
      • greeting.ftl
    • es_MX
      • greeting.ftl

Add some localizations in Fluent's FTL Syntax:

i18n/en_US/greeting.ftl:

hello = Hello, $name!
age.tracker = You are $age_hours hours old.

i18n/es_MX/greeting.ftl:

hello = ¡Hola, $name!
age.tracker = Tiene $age_hours horas.

Call the impl_localize!() macro at your crate root:

extern crate askama;
use askama::{Localize, impl_localize};

impl_localize! {
    #[localize(path = "i18n", default_locale = "en_US")]
    pub struct AppLocalizer(_);
}

This creates a struct called AppLocalizer which implements the askama Localize trait.

This will bake translations you provide into the output executable, to ease
deployment; all you need is one binary.

(Note: tests will be autogenerated to ensure that default_locale has full coverage of messages
used in the application. For this reason, you should probably use a language everyone on your dev team
speaks for the default locale.)

Now, on templates you want to localize, add a member of type AppLocalizer
annotated with the #[localizer] attribute:

#[derive(Template)]
#[template(path = "hello.html")]
struct HelloTemplate<'a> {
    #[localizer]
    localizer: AppLocalizer,
    name: &'a str,
    age_hours: f32,
}

And now you can use the localize filter in your templates:

<h1>{{ localize(hello, name: name) }}</h1>
<h3>{{ localize(age.tracker, age_hours: age_hours) }}</h1>

Now, your template will be automatically translated when rendered:

println!("{}", HelloTemplate {
    localizer: AppLocalizer::new(Some("en_US"), None),
    name: "Jamie",
    age_hours: 195_481.8
});
/*
<h1>Hello, Jamie!</h1>
<h3>You are 195,481.8 hours old.</h1>
*/
println!("{}", HelloTemplate {
    localizer: AppLocalizer::new(Some("es_MX"), None),
    name: "Jamie",
    age_hours: 195_481.8
});
/*
<h1>¡Hola, Jamie!</h1>
<h3>Tienes 195.481,8 horas.You are 195,481.8 hours old.</h1>
*/

To generate a coverage report, run:
cargo test i18n_coverage -- --nocapture

This will report on the percent of messages translated in each locale.

@kazimuth
Copy link
Contributor Author

kazimuth commented Apr 4, 2019

Ping @djc

@kazimuth
Copy link
Contributor Author

Hey @djc, i think this is ready to merge. Are there more changes you want me to make?

@djc
Copy link
Collaborator

djc commented Apr 10, 2019

Sorry - I've just been super busy. It's not forgotten and near the top of my OSS TODO list. Should've let you know sooner...

@kazimuth
Copy link
Contributor Author

no prob, i totally understand :) just let me know

@djc
Copy link
Collaborator

djc commented May 3, 2019

So I started taking a look at this, but this will need a bunch more work for me to be able to really review this. I hope you're up for that? I'm really sorry it's taken so long, but the size and complexity of this has turned this review into a chore for which I need to really gather time and energy and priority.

I prefer to review commit-by-commit and ideally want as-small-as-possible logically atomic commits. I'm not yet sure what a logical sequencing might be? I noticed some commits that change the parser (without adding anything), maybe we should start with that, in a separate PR? The hardest part with reviewing this PR currently is also the changes to earlier commits in later commits, so maybe some squashing could already help.

Maybe we could take the big init!() macro next, in general figuring out what global context the filter needs and how we can make the API for that make sense.

Also it would help if you can rebase this on master rather than having a merge commit in there.

@kazimuth
Copy link
Contributor Author

kazimuth commented May 3, 2019

No problem, I totally understand! I can rebase this and split it up to be more comprehensible. There should be some opportunities to simplify + add tests and docs, as well.

The basic chunks:

  1. The askama_shared::i18n module. This is runtime code for loading Fluent files and looking up translations. Right now it's extra tangled because I was implementing some functionality Fluent didn't offer (looking through a chain of languages to find a translation); they've since implemented that, so i think that can be ripped out.
  2. The big init! macro. This basically just creates a struct implementing Localize that calls into the askama_shared::i18n module. It also generates some tests that make sure the fluent source files don't have errors.
  3. The modifications to the parser, that ingest localize() calls.
  4. The modifications to the codegen. They just emit calls to methods of the Localize trait; they also generate tests that make sure that there are translations are available for the requested strings.

I can do these in whatever order. Should I open separate PRs for them, or make a chain of commits in a single PR?

@kazimuth
Copy link
Contributor Author

kazimuth commented May 3, 2019

Closing in favor of another PR.

@kazimuth kazimuth closed this May 3, 2019
@djc
Copy link
Collaborator

djc commented May 4, 2019

One PR per chunk might save you on rebasing, so let's do that? Also given my track record dealing with this large PR, let's go the safe route of multiple smaller ones...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants